12 research outputs found

    TriAnnot: A Versatile and High Performance Pipeline for the Automated Annotation of Plant Genomes

    In support of the international effort to obtain a reference sequence of the bread wheat genome and to provide plant communities dealing with large and complex genomes with a versatile, easy-to-use online automated tool for annotation, we have developed the TriAnnot pipeline. Its modular architecture allows for the annotation and masking of transposable elements, the structural and functional annotation of protein-coding genes with an evidence-based quality indexing, and the identification of conserved non-coding sequences and molecular markers. The TriAnnot pipeline is parallelized on a 712 CPU computing cluster that can run a 1-Gb sequence annotation in less than 5 days. It is accessible through a web interface for small scale analyses or through a server for large scale annotations. The performance of TriAnnot was evaluated in terms of sensitivity, specificity, and general fitness using curated reference sequence sets from rice and wheat. In less than 8 h, TriAnnot was able to predict more than 83% of the 3,748 CDS from rice chromosome 1 with a fitness of 67.4%. On a set of 12 reference Mb-sized contigs from wheat chromosome 3B, TriAnnot predicted and annotated 93.3% of the genes, among which 54% were perfectly identified in accordance with the reference annotation. It also allowed the curation of 12 genes based on new biological evidence, increasing the percentage of perfect gene prediction to 63%. TriAnnot systematically showed a higher fitness than other annotation pipelines that are not improved for wheat. As it is easily adaptable to the annotation of other plant genomes, TriAnnot should become a useful resource for the annotation of large and complex genomes in the future.

    Curated genome annotation of Oryza sativa ssp. japonica and comparative genome analysis with Arabidopsis thaliana

    We present here the annotation of the complete genome of rice Oryza sativa L. ssp. japonica cultivar Nipponbare. All functional annotations for proteins and non-protein-coding RNA (npRNA) candidates were manually curated. Functions were identified or inferred in 19,969 (70%) of the proteins, and 131 possible npRNAs (including 58 antisense transcripts) were found. Almost 5000 annotated protein-coding genes were found to be disrupted in insertional mutant lines, which will accelerate future experimental validation of the annotations. The rice loci were determined by using cDNA sequences obtained from rice and other representative cereals. Our conservative estimate based on these loci and an extrapolation suggested that the gene number of rice is ~32,000, which is smaller than previous estimates. We conducted comparative analyses between rice and Arabidopsis thaliana and found that both genomes possessed several lineage-specific genes, which might account for the observed differences between these species, while they had similar sets of predicted functional domains among the protein sequences. A system to control translational efficiency seems to be conserved across large evolutionary distances. Moreover, the evolutionary process of protein-coding genes was examined. Our results suggest that natural selection may have played a role for duplicated genes in both species, so that duplication was suppressed or favored in a manner that depended on the function of a gene.

    Additional file 2: Table S1. of The Nipponbare genome and the next-generation of rice genomics research in Japan

    Rice genes reported by Japanese researchers in various scientific journals in 2005–2014. Literature was searched and obtained from PubMed with ‘rice’ and ‘Oryza’ as keywords in either the title or abstract, and further selected by natural language processing and manual curation. The data can be accessed from Oryzabase (http://shigen.nig.ac.jp/rice/oryzabase/download/reference). (XLSX 215 kb)

    Transduction of RNA-directed DNA methylation signals to repressive histone marks in Arabidopsis thaliana

    RNA-directed modification of histones is essential for the maintenance of heterochromatin in higher eukaryotes. In plants, cytosine methylation is an additional factor regulating inactive chromatin, but the mechanisms regulating the coexistence of cytosine methylation and repressive histone modification remain obscure. In this study, we analysed the mechanism of gene silencing mediated by MORPHEUS' MOLECULE1 (MOM1) of Arabidopsis thaliana. Transcript profiling revealed that the majority of up-regulated loci in mom1 carry sequences related to transposons and homologous to the 24-nt siRNAs accumulated in wild-type plants that are the hallmarks of RNA-directed DNA methylation (RdDM). Analysis of a single-copy gene, SUPPRESSOR OF drm1 drm2 cmt3 (SDC), revealed that mom1 activates SDC with concomitant reduction of di-methylated histone H3 lysine 9 (H3K9me2) at the tandem repeats in the promoter region without changes in siRNA accumulation and cytosine methylation. The reduction of H3K9me2 is not observed in regions flanking the tandem repeats. The results suggest that MOM1 transduces RdDM signals to repressive histone modification in the core region of RdDM.

    The Rice Annotation Project Database (RAP-DB): 2008 update

    The Rice Annotation Project Database (RAP-DB) was created to provide the genome sequence assembly of the International Rice Genome Sequencing Project (IRGSP), manually curated annotation of the sequence, and other genomics information that could be useful for comprehensive understanding of rice biology. Since the last publication of the RAP-DB, the IRGSP genome has been revised and reassembled. In addition, a large number of rice-expressed sequence tags have been released, and functional genomics resources have been produced worldwide. Thus, we have thoroughly updated our genome annotation by manual curation of all the functional descriptions of rice genes. The latest version of the RAP-DB contains a variety of annotation data as follows: clone positions, structures and functions of 31 439 genes validated by cDNAs, RNA genes detected by massively parallel signature sequencing (MPSS) technology and sequence similarity, flanking sequences of mutant lines, transposable elements, etc. Other annotation data such as Gnomon can be displayed along with those of RAP for comparison. We have also developed a new keyword search system to allow the user to access useful information. The RAP-DB is available at: http://rapdb.dna.affrc.go.jp/ and http://rapdb.lab.nig.ac.jp/